use 'lit' as the field name for literal values #16498

Open
wants to merge 6 commits into
base: main
from

adriangb
Contributor

As per #16491 (comment) I think it's a bit strange that we try to create a field name from the repr of the value.

Consider this example: https://datafusion.apache.org/user-guide/sql/scalar_functions.html#id273
For cases of an array with hundreds of elements it will blow up and make a mess!

Could we use a fixed constant like 'lit' or 'field' instead?
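To make the size concern concrete, here is a minimal sketch. `Scalar`, `name_from_repr`, and `fixed_name` are hypothetical stand-ins invented for illustration, not DataFusion's `ScalarValue` or `Literal`; the sketch only assumes that the repr-based name is produced by a `Display` impl that walks every element of a list literal.

```rust
use std::fmt;

// Hypothetical stand-in for a scalar literal (NOT DataFusion's ScalarValue).
enum Scalar {
    Int64(i64),
    List(Vec<i64>),
}

impl fmt::Display for Scalar {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Scalar::Int64(v) => write!(f, "Int64({v})"),
            // Formatting a list visits every element, so the cost and the
            // resulting string both grow with the size of the literal.
            Scalar::List(items) => {
                write!(f, "[")?;
                for (i, item) in items.iter().enumerate() {
                    if i > 0 {
                        write!(f, ", ")?;
                    }
                    write!(f, "{item}")?;
                }
                write!(f, "]")
            }
        }
    }
}

/// Field name derived from the value's repr: grows with the literal.
fn name_from_repr(value: &Scalar) -> String {
    format!("{value}")
}

/// Fixed field name: the same short string regardless of the literal.
fn fixed_name(_value: &Scalar) -> String {
    "lit".to_string()
}

fn main() {
    let small = Scalar::Int64(1);
    let big = Scalar::List((0..500).collect());
    println!("small repr name: {}", name_from_repr(&small));
    println!("big repr name length: {}", name_from_repr(&big).len());
    println!("fixed name: {}", fixed_name(&big));
}
```

With a 500-element list the repr-based name runs to thousands of characters, while the fixed name stays at three.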

The main issue I could see is name collisions: e.g. select 1, 2, 3 would cause an error, which is unfortunate. I'm not sure how to resolve that, but the current behavior isn't much better:

> select 1, 1;
Error during planning: Projections require unique expression names but the expression "Int64(1)" at position 0 and "Int64(1)" at position 1 have the same name. Consider aliasing ("AS") one of them.
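A rough sketch of the collision concern, assuming (hypothetically) that the literal's field name ends up being used as the projection's expression name. `check_unique_names` is an invented helper that mimics the shape of the planner error above; it is not DataFusion's actual check.

```rust
use std::collections::HashMap;

/// Hypothetical uniqueness check over projection expression names.
/// Returns an error describing the first pair of positions that share a name.
fn check_unique_names(names: &[&str]) -> Result<(), String> {
    // Maps each name to the position where it was last seen.
    let mut seen: HashMap<&str, usize> = HashMap::new();
    for (pos, name) in names.iter().enumerate() {
        if let Some(first) = seen.insert(*name, pos) {
            return Err(format!(
                "expression \"{name}\" at position {first} and \"{name}\" at position {pos} have the same name"
            ));
        }
    }
    Ok(())
}

fn main() {
    // Repr-based names: distinct literals pass, repeated literals collide.
    println!("{:?}", check_unique_names(&["Int64(1)", "Int64(2)"]));
    println!("{:?}", check_unique_names(&["Int64(1)", "Int64(1)"]));
    // A fixed name would collide for any two literals.
    println!("{:?}", check_unique_names(&["lit", "lit", "lit"]));
}
```

Under that assumption, repr-based names only collide when the same literal is repeated, whereas a fixed name would collide for any two un-aliased literals.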

FWIW Postgres seems to have the concept of an "un-named" column:

ff=# select 1, 2, 3;
 ?column? | ?column? | ?column? 
----------+----------+----------
        1 |        2 |        3

But I'm not sure we want to introduce an "unnamed" field.

@github-actions github-actions bot added physical-expr (Changes to the physical-expr crates) and core (Core DataFusion crate) labels Jun 21, 2025
Contributor

@alamb alamb left a comment

This makes sense to me -- thank you @adriangb

@alamb
Contributor

alamb commented Jun 22, 2025

FYI @timsaucer

Contributor

@timsaucer timsaucer left a comment

This sounds reasonable, especially since the final schema will not use the field name here. If we change that at some point I could see it causing a problem.

@adriangb
Contributor Author

adriangb commented Jun 22, 2025

Thanks folks. One more thought: I wonder if this improves performance by some small fraction, given that lit(..) could be called in hot loops in some places and was innocently calling a pretty complex Display impl that, e.g., iterated over list literals.

Anyway, I plan to leave this open for review for another day or so, then I'll merge it.

@@ -66,8 +66,7 @@ impl Literal {
         value: ScalarValue,
         metadata: Option<FieldMetadata>,
     ) -> Self {
-        let mut field =
-            Field::new(format!("{value}"), value.data_type(), value.is_null());
+        let mut field = Field::new("lit".to_string(), value.data_type(), value.is_null());
Contributor

If we see string allocations on the hot path, perhaps we could do something like cache the FieldRef for each datatype or something 🤔

Contributor

(As a follow-on PR, to be clear)
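One possible shape for the caching idea floated above, as a sketch only. `DataType`, `Field`, `FieldRef`, and `cached_lit_field` are simplified hypothetical stand-ins rather than the Arrow or DataFusion types, and keying the cache on `(data type, nullable)` is an assumption made here because the diff builds the field from both `value.data_type()` and `value.is_null()`.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex, OnceLock};

// Hypothetical, simplified stand-ins (NOT the Arrow types).
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
enum DataType {
    Int64,
    Utf8,
}

#[derive(Debug, PartialEq)]
struct Field {
    name: String,
    data_type: DataType,
    nullable: bool,
}

type FieldRef = Arc<Field>;

/// Returns a shared "lit" field for the given type and nullability,
/// allocating the name string only the first time each key is requested.
fn cached_lit_field(data_type: &DataType, nullable: bool) -> FieldRef {
    static CACHE: OnceLock<Mutex<HashMap<(DataType, bool), FieldRef>>> = OnceLock::new();
    let cache = CACHE.get_or_init(|| Mutex::new(HashMap::new()));
    let mut guard = cache.lock().expect("cache mutex poisoned");
    guard
        .entry((data_type.clone(), nullable))
        .or_insert_with(|| {
            Arc::new(Field {
                name: "lit".to_string(),
                data_type: data_type.clone(),
                nullable,
            })
        })
        .clone() // clones the Arc (a refcount bump), not the Field
}

fn main() {
    let a = cached_lit_field(&DataType::Int64, false);
    let b = cached_lit_field(&DataType::Int64, false);
    let c = cached_lit_field(&DataType::Utf8, true);
    println!("a and b share an allocation: {}", Arc::ptr_eq(&a, &b));
    println!("{} {:?} nullable={}", c.name, c.data_type, c.nullable);
}
```

The trade-off is a mutex acquisition and a hash lookup per call in exchange for avoiding a string allocation, so whether this helps would need measuring; a real data type with nested or parameterized variants would also make the key more expensive to hash and clone.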

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jun 22, 2025
Labels
core Core DataFusion crate physical-expr Changes to the physical-expr crates sqllogictest SQL Logic Tests (.slt)
3 participants